COMBINATORICS OF GEOMETRICALLY DISTRIBUTED
Abstract. For words of length $n$, generated by independent geometric random
variables, we consider the mean and variance of the number of inversions and of a
parameter of Knuth from permutation in situ. In this way, $q$-analogues for these
parameters from the usual permutation model are obtained.



1. Introduction 

Let $X$ denote a geometrically distributed random variable, i.e. $\mathbb{P}\{X = k\} = pq^{k-1}$
for $k \in \mathbb{N}$ and $q = 1 - p$. The combinatorics of $n$ geometrically distributed
independent random variables $X_1, \dots, X_n$ has attracted recent interest, especially
because of applications in computer science. We mention just two areas, the skip list
[?], [18], [?], [12], [19], [?] and probabilistic counting [?], [10], [?], [?].

In [?] the number of left-to-right maxima was investigated for words $x_1 \dots x_n$, where
the letters $x_i$ are independently generated according to the geometric distribution. In
[14] the study of left-to-right maxima was continued, but now the parameters studied
were the mean value and mean position of the $r$-th maximum.

In [?] runs of consecutive equal letters in a string of $n$ geometrically distributed
independent random letters were studied.

In the present paper we deal with the number of inversions. This parameter is well
understood in the context of permutations, see e.g. [17]. An inversion in a word
$x_1 \dots x_n$ is a pair $1 \le i < j \le n$ such that $x_i > x_j$. In Section 2 we compute
the average and variance of this parameter. Interestingly, if we perform the limit
$q \to 1$ in these answers, we get exactly the same formulae as in the model of permutations.
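As a concrete illustration of this definition, here is a minimal Python sketch that counts inversions by comparing all pairs directly. The function name `count_inversions` is an assumption made for this example and does not come from the paper; the quadratic loop is chosen for clarity, not speed.

```python
def count_inversions(word):
    """Number of pairs i < j with word[i] > word[j] (strict inequality)."""
    n = len(word)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if word[i] > word[j]
    )


# Equal letters never form an inversion, which matters for words with repeats.
print(count_inversions([3, 1, 2]))
print(count_inversions([2, 2, 1]))
```

Because the inequality is strict, a word consisting of one repeated letter has no inversions at all, in contrast to a permutation, where all letters are distinct.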

Another parameter related to pairs of indices in a permutation is the parameter $a$ that
was studied by Knuth in the context of an algorithm to permute a file in situ [?],
compare also [?]. This parameter is defined as

$$a = \bigl|\{\, 1 \le i < j \le n : x_i = \min\{x_i, x_{i+1}, \dots, x_j\} \,\}\bigr|.$$
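The definition can be made concrete with a short Python sketch. The name `knuth_parameter` is an assumption for this illustration. The early `break` relies on the observation that once a smaller letter appears to the right of position $i$, the letter $x_i$ cannot be the minimum of any longer window starting at $i$.

```python
def knuth_parameter(word):
    """Number of pairs i < j such that word[i] is a (weak) minimum of word[i..j]."""
    n = len(word)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if word[j] < word[i]:
                # A smaller letter appeared: word[i] is no longer the minimum
                # of any window that starts at i and reaches position j or beyond.
                break
            total += 1
    return total


print(knuth_parameter([1, 3, 2]))
print(knuth_parameter([2, 2]))
```

Ties count in favour of $x_i$, because the condition is $x_i = \min\{\dots\}$ and not a strict inequality; this is where words with repeated letters differ from permutations.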

In this more complicated example, surprisingly, the limiting case $q \to 1$ again gives
exactly the formulae from the model of permutations (see Section 3).

Thus the examples treated in this paper can be interpreted as $q$-analogues of the two
parameters.

2. The number of inversions 

The probability that a random word of length $n$, produced by independent geometric
random variables, has $k$ inversions, is given as the coefficient of $v^k$ in

$$f(v) = \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n} \prod_{1 \le j < k \le n} v^{[i_j > i_k]}.$$

Here, $[P]$ is a characteristic function, being 1 when condition $P$ is satisfied and
0 otherwise. This is the notation of Iverson, popularized in [?]. The form of
this generating function is merely a reformulation of the definition of the number of
inversions.

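The expected value is obtained as $E = f'(1)$, which is

$$\begin{aligned}
E &= \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n} \sum_{1 \le j < k \le n} [i_j > i_k] \\
&= \binom{n}{2} \Bigl(\frac{p}{q}\Bigr)^{2} \sum_{i_1, i_2 \ge 1} q^{i_1+i_2}\, [i_1 > i_2] \\
&= \binom{n}{2} \Bigl(\frac{p}{q}\Bigr)^{2} \sum_{i_1 > i_2 \ge 1} q^{i_1+i_2}
 = \binom{n}{2} \frac{q}{1+q}.
\end{aligned}$$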
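The expectation rests on a single probability: that the first of two independent geometric letters strictly exceeds the second, which should equal $q/(1+q)$. The following Python sketch checks this numerically with a truncated double sum. The function names and the `cutoff` parameter are assumptions for this illustration, and the truncation makes the result approximate.

```python
def prob_first_exceeds_second(q, cutoff=200):
    """Truncated double sum of P{X1 = i1} * P{X2 = i2} over i1 > i2 >= 1."""
    p = 1.0 - q
    total = 0.0
    for i2 in range(1, cutoff + 1):
        for i1 in range(i2 + 1, cutoff + 1):
            total += (p * q ** (i1 - 1)) * (p * q ** (i2 - 1))
    return total


def expected_inversions(n, q):
    """Closed form binom(n, 2) * q / (1 + q) for the mean number of inversions."""
    return n * (n - 1) / 2 * q / (1 + q)


for q in (0.3, 0.5, 0.7):
    print(q, prob_first_exceeds_second(q), q / (1 + q))
```

Multiplying this probability by the number $\binom{n}{2}$ of index pairs gives the mean, by linearity of expectation.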




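Now we are going to compute the second factorial moment $E_2$, which is obtained by
$E_2 = f''(1)$, since the variance $V$ is given by $E_2 + E - E^2$;

$$E_2 = \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n}
\sum_{\substack{1 \le j < k \le n,\ 1 \le l < m \le n \\ (j,k) \ne (l,m)}} [i_j > i_k]\,[i_l > i_m].$$

There are several possibilities for $(j,k) \ne (l,m)$ to hold, yielding several contributions
to $E_2 = E_2^{A} + E_2^{B} + E_2^{C} + E_2^{D}$.

First, all 4 indices might be mutually different;

$$\begin{aligned}
E_2^{A} &= \binom{n}{2}\binom{n-2}{2} \Bigl(\frac{p}{q}\Bigr)^{4} \sum_{i_1,i_2,i_3,i_4 \ge 1} q^{i_1+i_2+i_3+i_4}\, [i_1 > i_2]\,[i_3 > i_4] \\
&= \binom{n}{2}\binom{n-2}{2} \Bigl(\frac{p}{q}\Bigr)^{4} \Bigl(\sum_{i_1 > i_2 \ge 1} q^{i_1+i_2}\Bigr)^{2}
 = \binom{n}{2}\binom{n-2}{2} \frac{q^2}{(1+q)^2}.
\end{aligned}$$

The second contribution stems from $j = l$, $k \ne m$:

$$\begin{aligned}
E_2^{B} &= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1,i_2,i_3 \ge 1} q^{i_1+i_2+i_3}\, [i_1 > i_2]\,[i_1 > i_3] \\
&= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1 > i_2 \ge 1,\ i_1 > i_3 \ge 1} q^{i_1+i_2+i_3}
 = 2\binom{n}{3} \frac{q(1+q^2)}{(1+q)(1+q+q^2)}.
\end{aligned}$$

The third contribution originates from $j \ne l$, $k = m$:

$$\begin{aligned}
E_2^{C} &= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1,i_2,i_3 \ge 1} q^{i_1+i_2+i_3}\, [i_1 > i_3]\,[i_2 > i_3] \\
&= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1 > i_3 \ge 1,\ i_2 > i_3 \ge 1} q^{i_1+i_2+i_3}
 = 2\binom{n}{3} \frac{q^2}{1+q+q^2}.
\end{aligned}$$

Finally, the two cases $j < k = l < m$ and $l < m = j < k$ can be combined by
symmetry;

$$\begin{aligned}
E_2^{D} &= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1,i_2,i_3 \ge 1} q^{i_1+i_2+i_3}\, [i_1 > i_2]\,[i_2 > i_3] \\
&= 2\binom{n}{3} \Bigl(\frac{p}{q}\Bigr)^{3} \sum_{i_1 > i_2 > i_3 \ge 1} q^{i_1+i_2+i_3}
 = 2\binom{n}{3} \frac{q^3}{(1+q)(1+q+q^2)}.
\end{aligned}$$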



Altogether we find

$$E_2 = \frac{n(n-1)(n-2)}{12}\, \frac{q\,\bigl(3nq(1+q+q^2) + 3q^3 + 7q^2 - q + 4\bigr)}{(1+q)^2(1+q+q^2)}.$$

The variance is thus

$$V = \frac{n(n-1)}{6}\, \frac{q}{(1+q)^2(1+q+q^2)}\, \bigl(2n(1-q+q^2) - q^2 + 7q - 1\bigr).$$
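The algebra leading from the four contributions (all indices distinct, common left index, common right index, chained indices) to the closed forms of the second factorial moment and the variance can be double-checked with exact rational arithmetic. The sketch below is such a check, not part of the paper: all function names are assumptions, and the formulas coded are the closed forms for the four contributions, for $E_2$, and for $V = E_2 + E - E^2$.

```python
from fractions import Fraction
from math import comb


def pair_contributions(n, q):
    """The four contributions to the second factorial moment, as exact fractions."""
    q = Fraction(q)
    s = 1 + q + q * q
    all_distinct = comb(n, 2) * comb(n - 2, 2) * q**2 / (1 + q) ** 2
    common_left = 2 * comb(n, 3) * q * (1 + q * q) / ((1 + q) * s)
    common_right = 2 * comb(n, 3) * q * q / s
    chained = 2 * comb(n, 3) * q**3 / ((1 + q) * s)
    return all_distinct, common_left, common_right, chained


def second_factorial_moment(n, q):
    """Closed form of the second factorial moment of the inversion count."""
    q = Fraction(q)
    s = 1 + q + q * q
    bracket = 3 * n * q * s + 3 * q**3 + 7 * q**2 - q + 4
    return Fraction(n * (n - 1) * (n - 2), 12) * q * bracket / ((1 + q) ** 2 * s)


def variance_closed_form(n, q):
    """Closed form of the variance of the inversion count."""
    q = Fraction(q)
    s = 1 + q + q * q
    bracket = 2 * n * (1 - q + q * q) - q * q + 7 * q - 1
    return Fraction(n * (n - 1), 6) * q * bracket / ((1 + q) ** 2 * s)


def variance_from_moments(n, q):
    """Variance assembled as E_2 + E - E^2 from the mean and E_2."""
    q = Fraction(q)
    mean = Fraction(n * (n - 1), 2) * q / (1 + q)
    return second_factorial_moment(n, q) + mean - mean**2


q = Fraction(1, 3)
print(sum(pair_contributions(5, q)) == second_factorial_moment(5, q))
print(variance_from_moments(5, q) == variance_closed_form(5, q))
```

Using `Fraction` avoids rounding, so the comparisons are exact identities for the chosen rational values of $q$ rather than approximate agreements.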

Summarizing, we obtain the following theorem. 

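Theorem 1. The average and the variance of the number of inversions in a random
word of length $n$ obtained by independent geometric random variables with probabilities
$\mathbb{P}\{X = k\} = pq^{k-1}$, are given by

$$E = \frac{n(n-1)}{2}\, \frac{q}{1+q},$$

$$V = \frac{n(n-1)}{6}\, \frac{q}{(1+q)^2(1+q+q^2)}\, \bigl(2n(1-q+q^2) - q^2 + 7q - 1\bigr).$$

For fixed $q$ and $n \to \infty$, we find

$$E \sim \frac{n^2}{2}\, \frac{q}{1+q}, \qquad
V \sim \frac{n^3}{3}\, \frac{q(1-q+q^2)}{(1+q)^2(1+q+q^2)}.$$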
On the other hand, for $q \to 1$, our formulae turn into

$$E = \frac{n(n-1)}{4}, \qquad V = \frac{n(n-1)(2n+5)}{72},$$

and these are exactly the formulae for the instance of permutations, compare e.g. [17].
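Both statements can be tested numerically. The Python sketch below (all names are assumptions for this illustration) compares the formulas of Theorem 1 with a brute-force enumeration of words over a truncated alphabet, and compares the specialization $q = 1$ with an exact enumeration of permutations. The truncation at `cutoff` letters makes the first comparison approximate; the second is exact.

```python
from fractions import Fraction
from itertools import permutations, product


def inversions(word):
    """Count pairs i < j with word[i] > word[j]."""
    return sum(
        1
        for i in range(len(word))
        for j in range(i + 1, len(word))
        if word[i] > word[j]
    )


def theorem1_mean(n, q):
    return Fraction(n * (n - 1), 2) * q / (1 + q)


def theorem1_variance(n, q):
    s = 1 + q + q * q
    bracket = 2 * n * (1 - q + q * q) - q * q + 7 * q - 1
    return Fraction(n * (n - 1), 6) * q * bracket / ((1 + q) ** 2 * s)


def truncated_geometric_moments(n, q, cutoff=40):
    """Mean and variance of the inversion count, enumerating letters 1..cutoff."""
    p = 1.0 - q
    weights = [p * q ** (k - 1) for k in range(1, cutoff + 1)]
    mean = second = 0.0
    for word in product(range(1, cutoff + 1), repeat=n):
        prob = 1.0
        for letter in word:
            prob *= weights[letter - 1]
        count = inversions(word)
        mean += prob * count
        second += prob * count * count
    return mean, second - mean * mean


def permutation_moments(n):
    """Exact mean and variance of the inversion count over all n! permutations."""
    counts = [inversions(perm) for perm in permutations(range(n))]
    mean = Fraction(sum(counts), len(counts))
    second = Fraction(sum(c * c for c in counts), len(counts))
    return mean, second - mean**2


print(truncated_geometric_moments(3, 0.5))
print(theorem1_mean(3, 0.5), theorem1_variance(3, 0.5))
print(permutation_moments(4), theorem1_mean(4, Fraction(1)), theorem1_variance(4, Fraction(1)))
```

The enumeration is exponential in $n$, so it is only meant as a sanity check for very short words.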



3. Knuth's parameter from permutation in situ 
This time, the generating function of interest is 

$$f(v) = \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n}
\prod_{1 \le j < k \le n} \Bigl( v\,[i_j = \min\{i_j,\dots,i_k\}] + [i_j \ne \min\{i_j,\dots,i_k\}] \Bigr).$$



Again, this is not really a useful generating function, but merely a direct translation 
of the definition. Nevertheless we find it appropriate in order to control the rather 
unwieldy expression. 

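As always, the expected value is obtained via $E = f'(1)$;

$$\begin{aligned}
E &= \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n} \sum_{1 \le j < k \le n} [i_j = \min\{i_j,\dots,i_k\}] \\
&= \sum_{1 \le j < k \le n} \Bigl(\frac{p}{q}\Bigr)^{k+1-j} \sum_{i_j = \min\{i_j,\dots,i_k\}} q^{i_j+\dots+i_k} \\
&= \sum_{1 \le j < k \le n} \Bigl(\frac{p}{q}\Bigr)^{k+1-j} \sum_{i \ge 1} q^{i} \Bigl(\frac{q^{i}}{1-q}\Bigr)^{k-j} \\
&= p \sum_{1 \le j < k \le n}\ \sum_{i \ge 1} q^{(i-1)(k+1-j)} \\
&= p \sum_{1 \le j < k \le n} \frac{1}{1-q^{k+1-j}} \\
&= p \sum_{2 \le h \le n} \frac{n+1-h}{1-q^{h}}.
\end{aligned}$$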
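The derivation hinges on the probability that the leftmost of $h$ consecutive letters is a weak minimum of the window, which should equal $p/(1-q^h)$. The Python sketch below checks this with a truncated sum and then evaluates the mean; the function names and the `cutoff` parameter are assumptions made for this illustration.

```python
from fractions import Fraction


def prob_left_letter_is_minimum(h, q, cutoff=400):
    """Truncated sum over i of P{X = i} * P{X >= i}^(h - 1) for a window of h letters."""
    p = 1.0 - q
    return sum(
        p * q ** (i - 1) * (q ** (i - 1)) ** (h - 1)
        for i in range(1, cutoff + 1)
    )


def knuth_mean(n, q):
    """p * sum over window lengths h = 2..n of (n + 1 - h) / (1 - q^h)."""
    p = 1 - q
    return sum(p * (n + 1 - h) / (1 - q**h) for h in range(2, n + 1))


for h in range(2, 6):
    print(h, prob_left_letter_is_minimum(h, 0.6), 0.4 / (1 - 0.6**h))
print(knuth_mean(3, Fraction(1, 2)))
```

The factor $n+1-h$ counts the windows of length $h$ inside a word of length $n$, so the mean is a weighted sum of these window probabilities.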

And the second factorial moment is again obtained by a second derivative;

$$E_2 = \Bigl(\frac{p}{q}\Bigr)^{n} \sum_{i_1,\dots,i_n \ge 1} q^{i_1+\dots+i_n}
\sum_{\substack{1 \le j < k \le n,\ 1 \le l < m \le n \\ (j,k) \ne (l,m)}}
[i_j = \min\{i_j,\dots,i_k\}]\,[i_l = \min\{i_l,\dots,i_m\}].$$

Now there are even more cases to be considered. We might have disjoint intervals, 
overlapping intervals or one interval being included in the other. Or, two indices might 
coincide, resulting in two intervals glued together or again one interval being included 
in the other with either a common left or right endpoint. 
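To see that this case distinction is exhaustive, one can classify every ordered pair of distinct index pairs and count the cases. The following Python sketch does so; the case labels and function names are assumptions chosen for this illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def classify(first, second):
    """Name the relative position of two distinct index pairs (j, k) and (l, m)."""
    (j, k), (l, m) = sorted([first, second])  # now j < l, or j == l and k < m
    if j == l:
        return "common left endpoint"
    if k < l:
        return "disjoint"
    if k == l:
        return "glued"
    if k < m:
        return "overlapping"
    if k == m:
        return "common right endpoint"
    return "nested"


def case_counts(n):
    """Count ordered pairs of distinct index pairs by their relative position."""
    pairs = list(combinations(range(1, n + 1), 2))
    return Counter(classify(a, b) for a in pairs for b in pairs if a != b)


counts = case_counts(6)
print(counts)
print(sum(counts.values()) == comb(6, 2) * (comb(6, 2) - 1))
```

Each of the three cases with four distinct indices occurs $2\binom{n}{4}$ times and each of the three cases with three distinct indices occurs $2\binom{n}{3}$ times, where the factor 2 accounts for the two orders of the index pairs.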

Assume first that $1 \le j < k < l < m \le n$. The corresponding contribution turns out
to be

$$p^2 \sum_{1 \le j < k < l < m \le n} \frac{1}{1-q^{k+1-j}}\, \frac{1}{1-q^{m+1-l}}.$$

Observe that in general

$$\sum_{1 \le j < k < l < m \le n} a_{k+1-j}\, a_{m+1-l}
= \sum_{\substack{2 \le i, j \le n-2 \\ i+j \le n}} a_i a_j \binom{n+2-i-j}{2}.$$

The next range is given by $1 \le j < l < m < k \le n$, with a contribution

$$p^2 \sum_{1 \le j < l < m < k \le n} \frac{1}{1-q^{k+1-j}}\, \frac{1}{1-q^{m+1-l}}.$$

We have the general formula

$$\sum_{1 \le j < l < m < k \le n} a_{k+1-j}\, a_{m+1-l}
= \sum_{2 \le i < j \le n} a_i a_j\, (n+1-j)(j-i-1).$$

For the range $1 \le j < l < k < m \le n$ we obtain the contribution

$$p^2 \sum_{1 \le j < l < m \le n} \frac{1}{1-q^{m+1-j}}\, \frac{1}{1-q^{m+1-l}}\, (m-l-1).$$

Observe again that in general

$$\sum_{1 \le j < l < m \le n} a_{m+1-j}\, a_{m+1-l}\, (m-l-1)
= \sum_{3 \le i < j \le n} a_i a_j\, (n+1-j)(i-2).$$

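Now the first range with 3 indices involved is $1 \le j < k = l < m \le n$ with a
contribution

$$p^2 \sum_{1 \le j < k < m \le n} \frac{1}{1-q^{m+1-j}}\, \frac{1}{1-q^{m+1-k}}.$$

Again, such a sum can be rearranged in general;

$$\sum_{1 \le j < k < m \le n} a_{m+1-j}\, a_{m+1-k}
= \sum_{2 \le i < j \le n} a_i a_j\, (n+1-j).$$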

The next range $1 \le j < l < m = k \le n$ gives a contribution

$$p^2 \sum_{1 \le j < l < m \le n} \frac{1}{1-q^{m+1-j}}\, \frac{1}{1-q^{m+1-l}}.$$

Here we note also a general formula;

$$\sum_{1 \le j < l < m \le n} a_{m+1-j}\, a_{m+1-l}
= \sum_{2 \le i < j \le n} a_i a_j\, (n+1-j).$$

The last range $1 \le j = l < k < m \le n$ gives a contribution

$$p \sum_{1 \le j < k < m \le n} \frac{1}{1-q^{m+1-j}}
= p \sum_{1 \le j < m \le n} \frac{m-j-1}{1-q^{m+1-j}}.$$

Observe that in general

$$\sum_{1 \le j < m \le n} (m-j-1)\, a_{m+1-j}
= \sum_{3 \le i \le n} (i-2)(n+1-i)\, a_i.$$
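The rearrangement formulas for the six ranges are purely combinatorial and hold for an arbitrary sequence $a_2, \dots, a_n$, so they can be verified by brute force. The Python sketch below compares each index sum with its rearranged form; the function name, the dictionary keys, and the test sequence are assumptions for this illustration. The two ranges with three indices and two factors share one formula and are checked together.

```python
from itertools import combinations
from math import comb


def check_rearrangements(n, a):
    """Compare each index sum with its rearranged form; `a` maps lengths 2..n to values."""
    idx = range(1, n + 1)
    quads = list(combinations(idx, 4))
    triples = list(combinations(idx, 3))
    lengths = range(2, n + 1)
    results = {}

    # Disjoint intervals [j, k] and [l, m] with j < k < l < m.
    lhs = sum(a[k + 1 - j] * a[m + 1 - l] for j, k, l, m in quads)
    rhs = sum(a[i] * a[j] * comb(n + 2 - i - j, 2)
              for i in lengths for j in lengths if i + j <= n)
    results["disjoint"] = lhs == rhs

    # Nested intervals with j < l < m < k.
    lhs = sum(a[k + 1 - j] * a[m + 1 - l] for j, l, m, k in quads)
    rhs = sum(a[i] * a[j] * (n + 1 - j) * (j - i - 1)
              for i in lengths for j in lengths if i < j)
    results["nested"] = lhs == rhs

    # Overlapping intervals with j < l < k < m; the summand does not depend on k.
    lhs = sum(a[m + 1 - j] * a[m + 1 - l] for j, l, k, m in quads)
    rhs = sum(a[i] * a[j] * (n + 1 - j) * (i - 2)
              for i in lengths for j in lengths if i < j)
    results["overlapping"] = lhs == rhs

    # Three indices j < l < m (glued intervals and common right endpoint).
    lhs = sum(a[m + 1 - j] * a[m + 1 - l] for j, l, m in triples)
    rhs = sum(a[i] * a[j] * (n + 1 - j)
              for i in lengths for j in lengths if i < j)
    results["three indices"] = lhs == rhs

    # Common left endpoint: sum over j < m of (m - j - 1) * a[m + 1 - j].
    lhs = sum((m - j - 1) * a[m + 1 - j] for j, m in combinations(idx, 2))
    rhs = sum((i - 2) * (n + 1 - i) * a[i] for i in lengths)
    results["common left endpoint"] = lhs == rhs
    return results


test_sequence = {i: 7 * i * i + 3 * i + 1 for i in range(2, 8)}
print(check_rearrangements(7, test_sequence))
```

In each identity the new summation indices are the lengths of the two intervals, and the remaining factor counts the number of ways to place intervals of those lengths inside $\{1, \dots, n\}$.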

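All these contributions come with a factor 2, because of symmetry.
Thus

$$\begin{aligned}
\frac{1}{2} E_2 ={}& p^2 \sum_{\substack{2 \le i, j \le n-2 \\ i+j \le n}} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}} \binom{n+2-i-j}{2} \\
&+ p^2 \sum_{2 \le i < j \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}}\, (n+1-j)(j-i-1) \\
&+ p^2 \sum_{3 \le i < j \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}}\, (n+1-j)(i-2) \\
&+ p^2 \sum_{2 \le i < j \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}}\, (n+1-j) \\
&+ p^2 \sum_{2 \le i < j \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}}\, (n+1-j) \\
&+ p \sum_{3 \le i \le n} \frac{(i-2)(n+1-i)}{1-q^{i}}.
\end{aligned}$$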

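After several tedious simplifications we arrive at this form;

$$\begin{aligned}
E_2 ={}& 2p^2 \sum_{1 \le i < m \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{m-i}} \binom{n+2-m}{2}
 - 2p \sum_{1 \le j \le n} \frac{(n+1-j)^2}{1-q^{j}} \\
&+ 2p^2 \sum_{1 \le i < j \le n} \frac{1}{1-q^{i}}\, \frac{1}{1-q^{j}}\, (n+1-j)(j-1) + n(n+1).
\end{aligned}$$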

However, we can still do better than that by noting that

$$\begin{aligned}
\sum_{1 \le i < m} \frac{1}{(1-q^{i})(1-q^{m-i})}
&= \sum_{1 \le i < m} \Bigl( \frac{1}{1-q^{i}} + \frac{1}{1-q^{m-i}} - 1 \Bigr) \frac{1}{1-q^{m}} \\
&= \frac{2}{1-q^{m}} \sum_{1 \le i < m} \frac{1}{1-q^{i}} - \frac{m-1}{1-q^{m}}.
\end{aligned}$$
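This partial-fraction identity follows from $\frac{1}{1-x} + \frac{1}{1-y} - 1 = \frac{1-xy}{(1-x)(1-y)}$ with $x = q^i$ and $y = q^{m-i}$. A quick exact check in Python, with function names that are assumptions for this illustration:

```python
from fractions import Fraction


def convolution_sum(m, q):
    """Sum over 1 <= i < m of 1 / ((1 - q^i) * (1 - q^(m - i)))."""
    return sum(1 / ((1 - q**i) * (1 - q ** (m - i))) for i in range(1, m))


def rewritten_sum(m, q):
    """(2 * sum over 1 <= i < m of 1 / (1 - q^i) - (m - 1)) / (1 - q^m)."""
    partial = sum(1 / (1 - q**i) for i in range(1, m))
    return (2 * partial - (m - 1)) / (1 - q**m)


q = Fraction(2, 5)
for m in range(2, 8):
    print(m, convolution_sum(m, q) == rewritten_sum(m, q))
```

The point of the rewriting is that the product of two $q$-denominators inside the sum is replaced by single denominators, at the cost of the outer factor $1/(1-q^m)$.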



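Thus

$$\begin{aligned}
E_2 ={}& -2p^2 \sum_{1 \le m \le n} \frac{m-1}{1-q^{m}} \binom{n+2-m}{2}
 + 2(n+1)\, p^2 \sum_{1 \le i < j \le n} \frac{n+1-j}{(1-q^{i})(1-q^{j})} \\
&- 2p \sum_{1 \le j \le n} \frac{(n+1-j)^2}{1-q^{j}} + n(n+1).
\end{aligned}$$

Inserting this into $V = E_2 + E - E^2$, we obtain the following theorem.

Theorem 2. The average and the variance of Knuth's parameter from the permutation
in situ problem for random words of length $n$ obtained by independent geometric random
variables with probabilities $\mathbb{P}\{X = k\} = pq^{k-1}$, are given by

$$E = p \sum_{1 \le i \le n} \frac{n+1-i}{1-q^{i}} - n,$$

$$\begin{aligned}
V ={}& -2p^2 \sum_{1 \le m \le n} \frac{m-1}{1-q^{m}} \binom{n+2-m}{2}
 + 2p^2 \sum_{1 \le i < j \le n} \frac{i\,(n+1-j)}{(1-q^{i})(1-q^{j})} \\
&- p^2 \sum_{1 \le i \le n} \frac{(n+1-i)^2}{(1-q^{i})^2}
 + p \sum_{1 \le j \le n} \frac{(n+1-j)(2j-1)}{1-q^{j}}.
\end{aligned}$$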
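The formulas of Theorem 2 can be evaluated directly and compared with a brute-force enumeration over a truncated alphabet. The Python sketch below does this for short words; all function names and the `cutoff` parameter are assumptions for this illustration, and the enumeration is only approximate because of the truncation.

```python
from fractions import Fraction
from itertools import product
from math import comb


def knuth_parameter(word):
    """Number of pairs i < j such that word[i] is a weak minimum of word[i..j]."""
    total = 0
    for i in range(len(word)):
        for j in range(i + 1, len(word)):
            if word[j] < word[i]:
                break
            total += 1
    return total


def theorem2_mean(n, q):
    p = 1 - q
    return p * sum((n + 1 - i) / (1 - q**i) for i in range(1, n + 1)) - n


def theorem2_variance(n, q):
    p = 1 - q
    a = {i: 1 / (1 - q**i) for i in range(1, n + 1)}
    first = sum((m - 1) * a[m] * comb(n + 2 - m, 2) for m in range(1, n + 1))
    second = sum(i * (n + 1 - j) * a[i] * a[j]
                 for j in range(1, n + 1) for i in range(1, j))
    third = sum((n + 1 - i) ** 2 * a[i] ** 2 for i in range(1, n + 1))
    fourth = sum((n + 1 - j) * (2 * j - 1) * a[j] for j in range(1, n + 1))
    return -2 * p**2 * first + 2 * p**2 * second - p**2 * third + p * fourth


def truncated_moments(n, q, cutoff=40):
    """Mean and variance of Knuth's parameter, enumerating letters 1..cutoff."""
    p = 1.0 - q
    weights = [p * q ** (k - 1) for k in range(1, cutoff + 1)]
    mean = second = 0.0
    for word in product(range(1, cutoff + 1), repeat=n):
        prob = 1.0
        for letter in word:
            prob *= weights[letter - 1]
        value = knuth_parameter(word)
        mean += prob * value
        second += prob * value * value
    return mean, second - mean * mean


half = Fraction(1, 2)
print(theorem2_mean(3, half), theorem2_variance(3, half))
print(truncated_moments(3, 0.5))
```

Passing a `Fraction` for `q` yields exact values, while passing a float gives a quick numerical evaluation of the same expressions.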

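As a corollary, let us evaluate these quantities for fixed $q$ and $n \to \infty$.
For this purpose, we need two infinite series:

$$\alpha := \sum_{i \ge 1} \frac{1}{q^{-i}-1}, \qquad \beta := \sum_{i \ge 1} \frac{1}{(q^{-i}-1)^2}.$$

Then

$$\begin{aligned}
E &= p \binom{n+1}{2} + p\,n \sum_{i \ge 1} \frac{1}{q^{-i}-1} - n + O(1) \\
&= \frac{p}{2}\, n^2 + \Bigl( \frac{p}{2} - 1 + p\alpha \Bigr) n + O(1).
\end{aligned}$$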
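The expansion can be checked numerically: the difference between the exact mean and the two leading terms should stay bounded and in fact settle to a constant as $n$ grows. The Python sketch below illustrates this; the function names and the truncation parameter `terms` are assumptions, and the series is evaluated in the equivalent form $q^i/(1-q^i)$ to avoid floating-point overflow for large $i$.

```python
def alpha(q, terms=2000):
    """Truncated series sum of 1 / (q^(-i) - 1), computed as q^i / (1 - q^i)."""
    return sum(q**i / (1.0 - q**i) for i in range(1, terms + 1))


def exact_mean(n, q):
    """Exact mean p * sum over h = 2..n of (n + 1 - h) / (1 - q^h)."""
    p = 1.0 - q
    return p * sum((n + 1 - h) / (1.0 - q**h) for h in range(2, n + 1))


def asymptotic_mean(n, q):
    """Leading terms (p / 2) n^2 + (p / 2 - 1 + p * alpha) n."""
    p = 1.0 - q
    return p / 2 * n * n + (p / 2 - 1 + p * alpha(q)) * n


q = 0.5
for n in (50, 200, 400):
    print(n, exact_mean(n, q) - asymptotic_mean(n, q))
```

For $q = 1/2$ the printed differences agree to many digits, which is consistent with an $O(1)$ remainder that converges.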

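For the variance, the computations are a bit more complicated. We treat the sums
separately:

$$\sum_{1 \le m \le n} \frac{m-1}{1-q^{m}} \binom{n+2-m}{2}
= \frac{n^4}{24} + \frac{n^3}{12} - \frac{n^2}{24} + \frac{n^2}{2} \sum_{m \ge 1} \frac{m-1}{q^{-m}-1} + O(n);$$

$$\sum_{1 \le i < j \le n} \frac{i\,(n+1-j)}{(1-q^{i})(1-q^{j})}
= \frac{n^4}{24} + \frac{n^3}{12} - \frac{n^2}{24} + \frac{n^2}{2} \Bigl( \alpha + \sum_{m \ge 1} \frac{m-1}{q^{-m}-1} \Bigr) + O(n);$$

$$\begin{aligned}
\sum_{1 \le i \le n} \frac{(n+1-i)^2}{(1-q^{i})^2}
&= \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6} + 2 \sum_{1 \le i \le n} \frac{(n+1-i)^2}{q^{-i}-1} + \sum_{1 \le i \le n} \frac{(n+1-i)^2}{(q^{-i}-1)^2} \\
&= \frac{n^3}{3} + \frac{n^2}{2} + 2n^2\alpha + n^2\beta + O(n);
\end{aligned}$$

$$\sum_{1 \le j \le n} \frac{(n+1-j)(2j-1)}{1-q^{j}} = \frac{n^3}{3} + \frac{n^2}{2} + O(n).$$

Collecting we find

$$V = \frac{pq}{3}\, n^3 + \Bigl( \frac{pq}{2} - p^2(\alpha + \beta) \Bigr) n^2 + O(n).$$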
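A numerical check of this expansion, as a sketch under stated assumptions: the code below evaluates the exact variance in the four-sum form $-2p^2 S_1 + 2p^2 S_2 - p^2 S_3 + p S_4$ (the sums are named here only for this example) and subtracts the two leading terms. The function names are assumptions, the double sum is computed with a running prefix sum to keep the cost linear in $n$, and floating-point arithmetic limits the accuracy.

```python
def series_constants(q, terms=2000):
    """Truncated alpha and beta, using 1 / (q^(-i) - 1) = q^i / (1 - q^i)."""
    ratios = [q**i / (1.0 - q**i) for i in range(1, terms + 1)]
    return sum(ratios), sum(r * r for r in ratios)


def exact_variance(n, q):
    """Variance of Knuth's parameter from the four-sum expression."""
    p = 1.0 - q
    a = [0.0] + [1.0 / (1.0 - q**i) for i in range(1, n + 1)]
    first = sum((m - 1) * a[m] * (n + 2 - m) * (n + 1 - m) / 2
                for m in range(1, n + 1))
    second = 0.0
    prefix = 0.0  # running sum of i * a[i] over i < j
    for j in range(1, n + 1):
        second += (n + 1 - j) * a[j] * prefix
        prefix += j * a[j]
    third = sum((n + 1 - i) ** 2 * a[i] ** 2 for i in range(1, n + 1))
    fourth = sum((n + 1 - j) * (2 * j - 1) * a[j] for j in range(1, n + 1))
    return -2 * p * p * first + 2 * p * p * second - p * p * third + p * fourth


def leading_terms(n, q):
    """(p q / 3) n^3 + (p q / 2 - p^2 (alpha + beta)) n^2."""
    p = 1.0 - q
    alpha, beta = series_constants(q)
    return p * q / 3 * n**3 + (p * q / 2 - p * p * (alpha + beta)) * n**2


q = 0.5
for n in (100, 1000):
    remainder = exact_variance(n, q) - leading_terms(n, q)
    print(n, remainder / n)
```

If the expansion is right, the printed ratio `remainder / n` should stay bounded as $n$ grows, reflecting the $O(n)$ error term.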

Now we consider the limit $q \to 1$.
For the expectation we easily get

$$E = (n+1)H_n - 2n.$$

For the variance we get

$$\begin{aligned}
V &= 2 \sum_{1 \le i < j \le n} \frac{n+1-j}{j} - \sum_{1 \le i \le n} \frac{(n+1-i)^2}{i^2} + \sum_{1 \le i \le n} \frac{(n+1-i)(2i-1)}{i} \\
&= \sum_{1 \le j \le n} \frac{(n+1-j)(4j-3)}{j} - \sum_{1 \le i \le n} \frac{(n+1-i)^2}{i^2} \\
&= -(n+1)H_n - (n+1)^2 H_n^{(2)} + 2n(n+2).
\end{aligned}$$
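The limiting formulas can be compared with an exhaustive enumeration of permutations for small $n$. The Python sketch below is such a check with exact rational arithmetic; the function names are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def knuth_parameter(word):
    """Number of pairs i < j such that word[i] is the minimum of word[i..j]."""
    total = 0
    for i in range(len(word)):
        for j in range(i + 1, len(word)):
            if word[j] < word[i]:
                break
            total += 1
    return total


def harmonic(n, order=1):
    """Harmonic number of the given order: sum of 1 / k^order for k = 1..n."""
    return sum(Fraction(1, k**order) for k in range(1, n + 1))


def limit_mean(n):
    return (n + 1) * harmonic(n) - 2 * n


def limit_variance(n):
    return -(n + 1) * harmonic(n) - (n + 1) ** 2 * harmonic(n, 2) + 2 * n * (n + 2)


def permutation_moments(n):
    """Exact mean and variance of Knuth's parameter over all n! permutations."""
    values = [knuth_parameter(perm) for perm in permutations(range(n))]
    mean = Fraction(sum(values), len(values))
    second = Fraction(sum(v * v for v in values), len(values))
    return mean, second - mean**2


for n in range(1, 6):
    print(n, permutation_moments(n), (limit_mean(n), limit_variance(n)))
```

For $n = 3$, for example, both computations give mean $4/3$ and variance $8/9$.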

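Here we only used standard summations involving harmonic numbers, as treated e.g.
in [15], [?]. (Recall that the harmonic numbers of first and second order are defined by

$$H_n = \sum_{1 \le k \le n} \frac{1}{k} \qquad \text{and} \qquad H_n^{(2)} = \sum_{1 \le k \le n} \frac{1}{k^2},$$

respectively.)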

Thus, in the limiting case, expectation and variance are exactly the same as in the
permutation model, compare [?], [21].

Some other $q$-analogues of harmonic numbers can be found e.g. in [?], [?].